Phonemes frequency based PLLR dimensionality reduction for language recognition
Abstract
This paper presents a new approach to reducing the dimensionality of Phone Log-Likelihood Ratio (PLLR) features, which have been shown to be effective for language recognition, by removing the likelihoods corresponding to less frequent phonemes. In this work, phoneme frequencies are estimated using a suitable phoneme recogniser. An i-vector framework is then used to represent the total variability in the reduced-dimensional PLLR feature space. This paper also proposes the use of Gaussian probabilistic linear discriminant analysis (GPLDA) as a back-end for Language Recognition Evaluation (LRE) tasks. The suitability of both the proposed dimensionality reduction technique and the GPLDA back-end has been evaluated on the NIST 2007 and 2011 LRE tasks. The results show that the novel dimensionality reduction method outperforms PCA-based dimensionality reduction by 7%. Furthermore, the results show that GPLDA outperforms generatively trained Gaussian back-ends, which have previously been used in conjunction with PLLR features.
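As a rough illustration of the idea summarised above, the following minimal sketch converts frame-level phone posteriors to PLLRs and keeps only the dimensions of the most frequent phonemes. It is not the authors' implementation. The logit form of the PLLR, the use of the top-scoring phone per frame as a stand-in for the phoneme recogniser's frequency estimate, and all function names and toy values are assumptions made for this example.

```python
import math
from collections import Counter


def pllr(posteriors, eps=1e-10):
    """Map one frame of phone posteriors to log-likelihood ratios (logit form, assumed)."""
    out = []
    for p in posteriors:
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0) and division by zero
        out.append(math.log(p / (1.0 - p)))
    return out


def frequent_phone_indices(frames, keep):
    """Return the indices of the `keep` phones that most often score highest in a frame.

    Assumption: the per-frame argmax stands in for a phoneme recogniser's output.
    """
    counts = Counter(max(range(len(f)), key=f.__getitem__) for f in frames)
    ranked = sorted(range(len(frames[0])), key=lambda i: (-counts[i], i))
    return sorted(ranked[:keep])


def reduce_pllr(frames, kept):
    """Compute PLLRs per frame and drop the dimensions of infrequent phones."""
    reduced = []
    for f in frames:
        llr = pllr(f)
        reduced.append([llr[i] for i in kept])
    return reduced


# Toy posteriors for 5 frames over a hypothetical 4-phone inventory.
frames = [
    [0.7, 0.1, 0.1, 0.1],
    [0.6, 0.2, 0.1, 0.1],
    [0.1, 0.1, 0.7, 0.1],
    [0.2, 0.1, 0.6, 0.1],
    [0.1, 0.6, 0.2, 0.1],
]
kept = frequent_phone_indices(frames, keep=2)
reduced = reduce_pllr(frames, kept)
```

In a full system the reduced frames would then feed the i-vector extractor. That stage and the GPLDA back-end are outside the scope of this sketch.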
Similar resources
Dimensionality reduction of phone log-likelihood ratio features for spoken language recognition
In a previous work, we introduced the use of log-likelihood ratios of phone posterior probabilities, called Phone Log-Likelihood Ratios (PLLR), as features for language recognition under an iVector-based approach, yielding high performance and promising results. However, the high dimensionality of the PLLR feature vectors (with regard to MFCC/SDC features) results in comparatively higher computat...
New insight into the use of phone log-likelihood ratios as features for language recognition
Phone Log-Likelihood Ratio (PLLR) features have recently been introduced as an effective way of making use of frame-level phone posteriors in language and speaker recognition systems. This paper takes a closer look at PLLR features and provides further evidence of their usefulness in spoken language recognition tasks, with a new set of experiments carried out on the ...
The Manifold Nature of Vowel Sounds
Recently there has been great interest in geometrically motivated approaches to data analysis and pattern recognition. Low-dimensional structure in higher-dimensional data can be exploited by manifold-based data reduction and learning algorithms to improve performance. The existence of such a structure in speech has not been formally documented. Toward this end, I present a derivation of the app...
PLLR features in language recognition system for RATS
In this paper, we study the use of features based on frame-by-frame phone posteriors (PLLRs) for language recognition. The results are reported on the datasets developed for the DARPA RATS (Robust Automatic Transcription of Speech) program, which seeks to advance state-of-the-art detection capabilities on audio from highly degraded communication channels. We show that systems based on the PLLRs ...
Exploiting Phone Log-Likelihood Ratio Features for the Detection of the Native Language of Non-Native English Speakers
Detecting the native language (L1) of non-native English speakers may be of great relevance in some applications, such as computer-assisted language learning or IVR services. In fact, the L1 detection problem closely resembles the problem of spoken language and dialect recognition. In particular, log-likelihood ratios of phone posterior probabilities, known as Phone Log-Likelihood Ratios (PLLR),...